LexA DNA binding domain optimized for Arabidopsis species

ABSTRACT

A synthetic nucleotide sequence encodes the LexA DNA binding domain, the nucleotide sequence having been modified to bring the codon usage in conformity with the preferred codon usage of Arabidopsis thaliana. The preferred sequence of the gene is provided as SEQ ID NO: 1. DNA constructs, transformed host cells and transgenic plants comprising the synthetic nucleotide sequence are also provided.

CROSS-REFERENCES TO RELATED APPLICATIONS

[0001] This application is a divisional of U.S. patent application Ser. No. 09/401,171, filed on Sept. 22, 1999, the contents of which are incorporated by reference.

FIELD OF THE INVENTION

[0002] This invention relates to increasing the expression of heterologous or chimeric proteins in plants, and more specifically to increasing protein expression in Arabidopsis species and other dicotyledons.

BACKGROUND OF THE INVENTION

[0003] One of the primary goals of plant genetic research and development is the production of transgenic plants that express a heterologous gene (i.e., produce a “foreign” protein) in an amount sufficient to confer a desired phenotype to the plant. While significant advances have been made in pursuit of this goal, the expression of certain heterologous genes in transgenic plants remains problematic. It is thought that numerous factors are involved in determining the ultimate level of expression of a heterologous gene in a plant. The amount of protein that is synthesized from a gene is a function of several complex and interrelated events, including transcription, RNA maturation, translation, and post-translational modification. Each of these processes is comprised of a large number of events, all of which are potentially regulated either independently or in concert.

[0004] The genetic code is considered “degenerate” in that more than one nucleic acid triplet (i.e., a “codon”) encodes the same amino acid. Many amino acids may be coded for by several different codons. In general, genes within a taxonomic group exhibit similarities in codon choice, regardless of the function of these genes. Thus, an estimate of the overall use of the genetic code by a taxonomic group can be obtained by summing codon frequencies of all its sequenced genes. Variation between degenerate base frequencies is not a neutral phenomenon, since systematic codon preferences have been reported for bacterial, yeast, plant and mammalian genes. Bias in codon choice within genes in a single species appears related to the level of expression of protein encoded by any particular gene.
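By way of illustration only, the pooling of codon frequencies over a set of genes described above may be sketched in Python as follows. The function names and the two short sequences are hypothetical and are not taken from any organism or from the present disclosure; the sketch assumes each input is an in-frame coding sequence.

```python
from collections import Counter

def codon_counts(cds):
    """Count the in-frame codons of one coding sequence."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))

def pooled_codon_frequencies(genes):
    """Sum codon counts over several genes and return relative frequencies."""
    total = Counter()
    for cds in genes:
        total.update(codon_counts(cds))
    n = sum(total.values())
    return {codon: count / n for codon, count in total.items()}

# Two made-up toy sequences (assumption: illustrative only, not real genes).
toy_genes = ["ATGGCTGCTAAG", "ATGGCGAAGAAG"]
freqs = pooled_codon_frequencies(toy_genes)
```

In such a sketch, a codon usage table for a taxonomic group would be obtained by supplying all of its sequenced coding regions in place of the toy sequences.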

[0005] Codon bias is most extreme in highly expressed proteins of bacteria (e.g., E. coli) and yeast. In unicellular organisms, highly expressed genes use a smaller subset of codons than do weakly expressed genes, although the codons preferred are distinct in some cases. Sharp and Li, Nucl. Acids Res. 14, 7734-7749 (1986), report that codon usage in 165 E. coli genes reveals a positive correlation between high expression and increased codon bias. In these organisms, a strong positive correlation has been reported between the abundance of an isoaccepting tRNA species and the favored synonymous codon. For example, in one group of highly expressed proteins in yeast, over 96% of the amino acids are encoded by only 25 of the 61 available codons. See Bennetzen and Hall, J. Biol. Chem. 257, 3026-3031 (1982). These 25 codons are preferred in all sequenced yeast genes, but the degree of preference varies with the level of expression of the genes. Biased codon choice in highly expressed genes appears to enhance translation, and is required for maintaining mRNA stability in yeast. It has been proposed that the good fit of abundant yeast and E. coli mRNA codon usage to isoacceptor tRNA abundance promotes high translation levels and high steady state levels of these proteins. These results strongly suggest that the potential for high levels of expression of plant genes in yeast or E. coli is limited by their codon usage, and conversely that high levels of expression of E. coli or yeast genes in plant cells is similarly limited by the preferred codon usage of the different organisms.

[0006] Although plant codon usage patterns are distinct from those reported for bacteria, yeast, and animals, in general the plant codon usage pattern more closely resembles that of higher eukaryotes than unicellular organisms, due to the overall preference for G+C content in codon position III. Moreover, analysis of a large group of plant gene sequences indicates that synonymous codons are used differently by monocots and dicots. Wilbur et al., Plant Physiol. 92: 1-11 (1990), describes the difference in codon usage between bacteria and higher plants such as dicotyledonous and monocotyledonous plants. For example, the codon usages for codons XCG and XUA are 1.8% and 3.2% in dicotyledonous plants and 6.3% and 1.4% in monocotyledonous plants. The combined codon usage for codons XXC and XXG (hereinafter referred to as the codon XXC/G usage, wherein each of the two Xs is independently selected from the group consisting of A, G, C and T) is 45% in dicotyledons and 73.5% in monocotyledons. It is well established that GC content in genes which can be translated is higher in monocotyledons such as gramineous plants, e.g., rice plants, than in dicotyledons. As to bacteria, the codon usage apparently varies by strain.
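As a minimal sketch of how the codon XXC/G usage defined above might be computed for a single coding sequence, the following Python function reports the percentage of in-frame codons whose third position is C or G. The function name and the toy sequence are assumptions made for illustration; the percentages quoted above come from the cited literature and are not reproduced by this toy input.

```python
def xxcg_usage(cds):
    """Percentage of in-frame codons ending in C or G (codon XXC/G usage)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence contains no complete codon")
    hits = sum(1 for codon in codons if codon[2] in "CG")
    return 100.0 * hits / len(codons)

# Toy sequence (assumption): ATG GCT AAG TTC, three of four codons end in C or G.
toy_usage = xxcg_usage("ATGGCTAAGTTC")
```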

[0007] In this regard, investigators have determined that typical plant structural coding sequences preferentially utilize certain codons to encode certain amino acids in a different frequency than the frequency of usage appearing in bacterial or other non-plant coding sequences. Thus, it has been suggested that the difference between the typical codon usage present in plant coding sequences and the typical codon usage present in non-plant coding sequences is a factor contributing to the low levels of non-plant mRNA and non-plant protein produced in transgenic plants. These differences in codon usage may contribute to the low levels of mRNA or protein expressed by the non-plant coding sequence in a transgenic plant by affecting the transcription or translation of the coding sequence or proper mRNA processing.

[0008] Recently, attempts have been made to alter the structural coding sequence of a desired polypeptide or protein in an effort to enhance its expression in the plant. In particular, investigators have altered the codon usage of heterologous, structural coding sequences (i.e., heterologous genes) in an attempt to enhance their expression in plants. Most notably, the sequence encoding insecticidal crystal proteins of Bacillus thuringiensis (Bt) has been modified in various ways to enhance its expression in a plant, particularly monocotyledonous plants, to produce commercially viable insect-tolerant plants.

[0009] U.S. Pat. No. 5,380,831 to Adang et al. describes synthetic Bt genes designed to be expressed at a level higher than naturally-occurring Bt genes. The genes utilize codons preferred in highly expressed monocot or dicot proteins. Specifically, the synthetic genes, while about 85% homologous to the native bacterial sequence, are chemically modified to contain codons that are preferred by highly expressed plant genes, and to eliminate undesirable sequences that cause destabilization, termination of RNA, secondary structures and RNA splice sites.

[0010] U.S. Pat. No. 5,436,391 to Fujimoto et al. describes a synthetic gene encoding the insecticidal protein Bt. The gene is provided having a base sequence which has been modified to bring the codon usage in conformity with the genes of graminaceous plants, particularly rice plants (e.g., Oryza).

[0011] U.S. Pat. No. 5,689,052 to Brown et al. describes a method for modifying a foreign nucleotide sequence for enhanced accumulation of its protein product in a monocotyledonous plant, and/or increasing the frequency of obtaining transgenic monocotyledonous plants which accumulate useful amounts of a transgenic protein, by reducing the frequency of the rare and semi-rare monocotyledonous codons in the foreign gene and replacing them with more preferred monocotyledonous codons.

[0012] Another approach to altering the codon usage of a Bt toxin gene to enhance its expression in plants is described in U.S. Pat. No. 5,500,365 to Fischhoff et al. Here, the synthetic plant gene was prepared by modifying the coding sequence to remove all ATTTA sequences and certain identified putative polyadenylation signals. Moreover, the gene sequence was scanned to identify regions with greater than four consecutive adenine or thymine nucleotides. If more than one of the minor polyadenylation signals was identified within ten nucleotides of each other, then the nucleotide sequence of this region was altered to remove these signals while maintaining the original encoded amino acid sequence. The overall G+C content was also adjusted to provide a final sequence having a G+C ratio of about 50%. Similarly, U.S. Pat. No. 5,877,306 to Cornelissen et al. discloses a method of modifying a DNA sequence encoding a Bt crystal protein toxin wherein the gene was modified by reducing the A+T content. This was accomplished by changing the adenine and thymine bases to cytosine and guanine, while maintaining a coding sequence for the original protein toxin.
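The kind of sequence scan described above (locating ATTTA sequences and regions of more than four consecutive adenine or thymine nucleotides) may be sketched as follows. This is an illustrative sketch only and is not the procedure of the cited patents; the function names and the toy sequence are assumptions, and the scan for specific polyadenylation signals is not attempted.

```python
import re

def find_motif(seq, motif="ATTTA"):
    """Return 0-based start positions of every occurrence, overlaps included."""
    pattern = "(?=" + re.escape(motif) + ")"
    return [m.start() for m in re.finditer(pattern, seq.upper())]

def at_runs(seq, min_len=5):
    """Return (start, run) for each maximal run of A/T of at least min_len bases.

    min_len=5 corresponds to "greater than four consecutive" A or T.
    """
    pattern = "[AT]{%d,}" % min_len
    return [(m.start(), m.group()) for m in re.finditer(pattern, seq.upper())]

def gc_percent(seq):
    """Overall G+C content of a sequence, in percent."""
    seq = seq.upper()
    return 100.0 * sum(1 for base in seq if base in "GC") / len(seq)

# Made-up toy sequence (assumption), not taken from any Bt gene.
toy = "GCATTTAGCGCAATTAAGC"
motif_hits = find_motif(toy)
runs = at_runs(toy)
```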

[0013] While the foregoing examples emphasize the modification or optimization of codon usage in heterologous structural genes (i.e., the genes encoding a desired protein product, such as Bt toxin), the modification of regulatory elements that control transcription by optimizing codon usage in the host plant has not been emphasized. It is widely recognized that the upstream regulatory elements that control transcription and translation have very significant roles in determining the quantity, timing, and tissue specificity of gene expression. Various nucleotide sequences other than the heterologous structural coding sequence affect the expression levels of a foreign DNA sequence introduced into a plant, including promoter sequences, intron sequences, 3′ untranslated sequences, polyadenylation sites, and other regulatory sequences.

[0014] In view of the foregoing, activation and control of transcription are processes that may desirably be manipulated in order to achieve altered (i.e., increased or decreased) expression of a heterologous structural gene in a plant cell. Transcription can be activated through the use of two functional domains of a transcription activation moiety: a domain (i.e., sequence of amino acids) that recognizes and binds to a specific site or sequence of nucleotides on a target DNA (the DNA binding domain); and a domain that is capable of activating transcription of the DNA when physically associated with the DNA-binding domain and which may be necessary for activation of the target gene (the activation domain). See Keegan, et al., Science 231, 669-704 (1986); Ma and Ptashne, Cell 48, 847-853 (1987). The two functional domains may be derived from a single transcription activation protein. Alternatively, it has been shown that these two functions can also reside on separate proteins. See McKnight et al., Proc. Natl. Acad. Sci. USA 89, 7061-7065 (1987); Curran et al. 55, 395-397 (1988). The transcription activation domains may also be derived from synthetic DNA-binding and transcription activation proteins.

[0015] Additional flexibility in controlling heterologous gene expression in plants may be obtained by using DNA binding domains and response elements from heterologous sources (i.e., DNA binding domains from non-plant sources). Some examples of such heterologous DNA binding domains include the LexA and GAL4 DNA binding domains. The LexA DNA-binding domain is part of the repressor protein LexA from Escherichia coli (E. coli) (Brent and Ptashne, Cell 43:729-736 (1985)).

[0016] Although the LexA DNA binding domain functions as an efficient DNA binding domain in its natural bacterial host, when transferred by recombinant DNA technology into higher eukaryotes such as plants, the domain is not efficiently expressed. Accordingly, it would be desirable to alter the LexA DNA binding domain to increase its expression in higher eukaryotes such as plants.

SUMMARY OF THE INVENTION

[0017] Certain objects, advantages and novel features of the invention will be set forth in the description that follows, and will become apparent to those skilled in the art upon examination of the following, or may be learned with the practice of the invention.

[0018] It is an object of the invention to provide a synthetic LexA DNA binding domain optimized for codon usage in plants, more specifically in dicots, and most specifically in Arabidopsis thaliana.

[0019] The invention relates to adapting the codons of the DNA binding domain of the LexA gene from E. coli to the codon usage of Arabidopsis thaliana. This method is advantageous in that it allows for the increased expression of heterologous and chimeric proteins containing this artificial DNA binding domain.

[0020] Additional aspects of this invention include constructs (i.e., vectors, DNA fusions and polynucleotides) comprising the synthetic DNA sequence of the present invention. These constructs are useful for increasing heterologous protein expression in plant cells. Further aspects of the invention are cells, plant lines, and transgenic plants transformed with the described constructs. Methods of increasing expression of heterologous proteins in a cell or a transgenic plant are an additional aspect of the present invention.

[0021] The foregoing and other aspects of the present invention are explained in detail in the specification set forth below.

BRIEF DESCRIPTION OF THE DRAWINGS

[0022] FIG. 1 sets forth the nucleotide and amino acid sequences of the LexA DNA binding domain optimized for Arabidopsis thaliana codon usage.

[0023] FIG. 2 sets forth the nucleotide sequences of the seven oligonucleotides (SEQ ID NO: 3-9) that were annealed together to arrive at the LexA DNA binding domain optimized for Arabidopsis thaliana codon usage. SEQ ID NO: 10 is the LexA activation domain to which the LexA binding domain normally binds.

[0024] FIG. 3 is a schematic representation of the plasmid pUCNLSTBP1.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0025] The present invention will now be described more fully hereinafter with reference to the accompanying figures, in which preferred embodiments of the invention are shown. This invention may, however, be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.

[0026] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety.

[0027] Amino acid sequences disclosed herein are presented in the amino to carboxy direction, from left to right. The amino and carboxy groups are not presented in the sequence. Nucleotide sequences are presented herein by single strand only, in the 5′ to 3′ direction, from left to right. Nucleotides and amino acids are represented herein in the manner recommended by the IUPAC-IUB Biochemical Nomenclature Commission, or (for amino acids) by three letter code, in accordance with 37 CFR §1.822 and established usage. See, e.g., Patent In User Manual, 99-102 (November 1990) (U.S. Patent and Trademark Office).

[0028] The term “amino acid sequence,” as used herein, refers to either a naturally occurring or a synthetic oligopeptide, peptide, polypeptide, or protein sequence, and fragments thereof. Where “amino acid sequence” is recited herein to refer to an amino acid sequence of a naturally occurring protein molecule, the term “amino acid sequence,” and like terms, are not meant to limit the amino acid sequence to the complete, native amino acid sequence associated with the recited protein molecule.

[0029] The term “nucleic acid sequence” as used herein refers to a nucleotide, oligonucleotide, or polynucleotide, and fragments thereof, and to DNA or RNA of genomic or synthetic origin which may be single- or double-stranded, and which may represent a sense or antisense strand.

[0030] “Chemically synthesized,” as related to a sequence of DNA, means that the component nucleotides are assembled in vitro. Manual chemical synthesis of DNA may be accomplished using well established procedures (Caruthers, M., in Methodology of DNA and RNA Sequencing, Chapter 1 (Weissman (ed.), Praeger Publishers, New York (1983))). Alternatively, automated chemical synthesis of DNA can be performed using one of a number of commercially available apparatus.

[0031] As used herein, the term “LexA” or “LexA binding domain” refers to a protein or domain that is naturally encoded in E. coli by the lexA gene, which domain normally binds to the DNA sequence CATACTGTATGAGCATACAG (the LexA activation domain) [SEQ ID NO: 10]. The term “LexA binding domain adapted (or optimized) for Arabidopsis thaliana usage” means a protein domain in which the codons encoding the domain have been modified from the natural E. coli sequence in order to maximize DNA binding (and accordingly, transcription) in Arabidopsis thaliana.

[0032] As used herein, the term “expression” refers to the transcription and translation of a heterologous or homologous gene to yield the protein encoded by the gene.

[0033] “Heterologous” is used to indicate that a nucleic acid sequence (e.g., a gene) or a protein has a different natural origin or source with respect to its current host. “Heterologous” is also used to indicate that one or more of the domains present in a protein differ in their natural origin with respect to other domains present.

[0034] As used herein, a “structural gene” is that portion of a gene comprising a DNA segment encoding a protein, polypeptide or a portion thereof, and excluding the 5′ sequence which drives the initiation of transcription. The structural gene may be one which is normally found in the cell or one which is not normally found in the cellular location wherein it is introduced, in which case it is termed a heterologous gene. A heterologous gene may be derived in whole or in part from any source known to the art, including a bacterial genome or episome, eukaryotic, nuclear or plasmid DNA, cDNA, viral DNA or chemically synthesized DNA. A structural gene may contain one or more modifications in either the coding or the untranslated regions which could affect the biological activity or the chemical structure of the expression product, the rate of expression or the manner of expression control. Such modifications include, but are not limited to, mutations, insertions, deletions and substitutions of one or more nucleotides. The structural gene may constitute an uninterrupted coding sequence or it may include one or more introns, bounded by the appropriate splice junctions. The structural gene may be a composite of segments derived from a plurality of sources, either naturally occurring or synthetic, or both. When synthesizing a gene for improved expression in a host cell it is desirable to design the gene such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.

[0035] As used herein, the term “chimeric protein” is used to indicate that the protein is comprised of domains, at least one of which has an origin or source that is heterologous with respect to the other domains present. Chimeric proteins are encoded by nucleotide sequences that have been fused or ligated together, resulting in a coding sequence that does not occur naturally. “Chimeric sequences” or genes refer to nucleic acid sequences containing at least two heterologous parts, e.g., parts derived from naturally occurring nucleic acid sequences that are not associated in their naturally occurring states, or containing at least one part that is of synthetic origin and not found in nature.

[0036] To improve expression of the DNA-binding domain of the LexA repressor protein in plants, the codon usage of the natural E. coli LexA DNA binding domain was modified and optimized to the codon usage of Arabidopsis thaliana. The characteristics of codon usage for Arabidopsis thaliana are based on the report of Wada et al., “Codon Usage Tabulated From The GenBank Genetic Sequence Data,” Nucleic Acids Research 19 (Supp.) 1981-1986 (1991). Table 1, below, based upon the data provided in the Wada et al. article, illustrates the differences between the codons preferred by E. coli for each amino acid vs. the codons preferred by A. thaliana for each amino acid (the codon listed in the table being the most preferred codon, according to the Wada et al. data).

[0037] In order to construct a synthetic DNA sequence in accordance with the present invention, the amino acid sequence of the E. coli LexA DNA binding domain is determined and back-translated into all the available codon choices for each amino acid. The amino acid sequence of the protein can be analyzed using commercially available computer software such as the BACKTRANSLATE® program of the GCG Sequence Analysis software package. After the sequence is back-translated, the codons that are preferentially and optimally used in Arabidopsis thaliana are substituted for the naturally occurring codons of the E. coli LexA binding domain. In other words, for each amino acid of the LexA DNA binding domain that may be encoded for by more than one codon, the codon preferred by Arabidopsis thaliana is placed in the sequence in place of the original codon. When all codons of the E. coli LexA binding domain are replaced with the preferred codons of A. thaliana, the back-translation yields a DNA sequence of 258 nucleotides that encodes the LexA DNA binding domain optimized for use in Arabidopsis thaliana.

TABLE 1
Preferred Codon Usage of E. coli vs. A. thaliana. Data taken from Wada et al., “Codon Usage Tabulated From The GenBank Genetic Sequence Data,” Nucleic Acids Research 19 (Supp.) 1981-1986 (1991)

Amino Acid    Preferred Codon for E. coli    Preferred Codon for A. thaliana
ARG           CGT                            AGA
LEU           CTG                            CTT
SER           AGC                            TCT
THR           ACC                            ACC
PRO           CCG                            CCA
ALA           GCG                            GCT
GLY           GGC                            GGA
VAL           GTT                            GTT
LYS           AAA                            AAG
ASN           AAC                            AAC
GLN           CAG                            CAG
HIS           CAT                            CAC
GLU           GAA                            GAG
ASP           GAT                            GAT
TYR           TAT                            TAC
CYS           TGC                            TGC
PHE           TTT                            TTC
ILE           ATC                            ATC
MET           ATG                            ATG
TRP           TGG                            TGG
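For illustration, the substitution step described above may be sketched in Python using the A. thaliana column of Table 1 as a lookup table. This is a hedged sketch and is not the BACKTRANSLATE® program; the names PREFERRED_A_THALIANA and back_translate are hypothetical. The input is the first ten residues of the amino acid sequence given herein as SEQ ID NO: 2.

```python
# Most preferred A. thaliana codon for each amino acid, copied from Table 1.
PREFERRED_A_THALIANA = {
    "Arg": "AGA", "Leu": "CTT", "Ser": "TCT", "Thr": "ACC", "Pro": "CCA",
    "Ala": "GCT", "Gly": "GGA", "Val": "GTT", "Lys": "AAG", "Asn": "AAC",
    "Gln": "CAG", "His": "CAC", "Glu": "GAG", "Asp": "GAT", "Tyr": "TAC",
    "Cys": "TGC", "Phe": "TTC", "Ile": "ATC", "Met": "ATG", "Trp": "TGG",
}

def back_translate(residues, table=PREFERRED_A_THALIANA):
    """Join the preferred codon for each three-letter residue, in order."""
    return "".join(table[residue] for residue in residues)

# First ten residues of SEQ ID NO: 2, in three-letter code.
peptide = "Met-Lys-Ala-Leu-Thr-Ala-Arg-Gln-Gln-Glu".split("-")
dna = back_translate(peptide)
```

Applied to a full 86-residue sequence, a lookup of this kind produces 86 codons, i.e., 258 nucleotides, which agrees with the length stated above.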

[0038] As further explained herein in the Examples, and in one preferred embodiment of the invention, after the DNA sequence encoding the LexA DNA binding domain completely optimized for use in Arabidopsis thaliana is determined, the 258 nucleotides representing one strand of the DNA are arbitrarily divided into three oligonucleotide sequences. The 258 nucleotides representing the other, complementary strand of the DNA are arbitrarily divided into four oligonucleotide sequences. The oligonucleotide sequences are then chemically synthesized as seven oligomers, each being phosphorylated at its 5′ end. In addition, four of the oligomers have additional nucleotides added to the ends in order to create “sticky ends” (restriction sites) for use in ligating the DNA into recombinant vectors. After the seven oligomers are synthesized, the oligomers are annealed together, to yield the following synthetic DNA sequence encoding the LexA DNA binding domain codon optimized for usage in Arabidopsis thaliana:

[0039] ATGAAGGCTCTTACCGCTACCGCTAGACAGCAGGAGGTTTTCGATCTTATCAGAGATCACATCTCTCAGACCGGAATGCCACCACCAACCAGAGCTGAGATCGCTCAGAGACTTGGATTCAGATCTCCAAACGCTGCTGAGGAGCACCTTAAGGCTCTTGCTAGAAAGGGAGTTATCGAGATCGTTTCTGGAGCTTCTAGAGGAATCAGACTTCTTCAGGAGGAGGAGGAGGGACTTCCACTTGTTGGAAGAGTTGCTGCTGGAGAG (SEQ ID NO: 1, also set forth in FIG. 1)

[0040] which DNA sequence translates into the following amino acid sequence of the LexA DNA binding domain codon optimized for Arabidopsis thaliana:

[0041] Met-Lys-Ala-Leu-Thr-Ala-Arg-Gln-Gln-Glu-Val-Phe-Asp-Leu-Ile-Arg-Asp-His-Ile-Ser-Gln-Thr-Gly-Met-Pro-Pro-Thr-Arg-Ala-Glu-Ile-Ala-Gln-Arg-Leu-Gly-Phe-Arg-Ser-Pro-Asn-Ala-Ala-Glu-Glu-His-Leu-Lys-Ala-Leu-Ala-Arg-Lys-Gly-Val-Ile-Glu-Ile-Val-Ser-Gly-Ala-Ser-Arg-Gly-Ile-Arg-Leu-Leu-Gln-Glu-Glu-Glu-Glu-Gly-Leu-Pro-Leu-Val-Gly-Arg-Val-Ala-Ala-Gly-Glu (SEQ ID NO: 2, also set forth in FIG. 1).
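The division of the two strands into three and four oligonucleotides described in paragraph [0038] may be sketched as follows. The break points, function names and the 30-nucleotide toy fragment are assumptions chosen for illustration; the actual oligonucleotides are those of SEQ ID NO: 3-9, and the added restriction-site overhangs and 5′ phosphorylation are not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def split_at(seq, breakpoints):
    """Cut seq at the given 0-based positions and return the pieces in order."""
    edges = [0] + sorted(breakpoints) + [len(seq)]
    return [seq[a:b] for a, b in zip(edges, edges[1:])]

# 30-nt toy fragment (assumption), standing in for the 258-nt strand.
top = "ATGAAGGCTCTTACCGCTAGACAGCAGGAG"
bottom = reverse_complement(top)

# Hypothetical break points: three pieces on one strand, four on the other.
# The junctions are offset so that no two nicks face each other after annealing.
top_oligos = split_at(top, [10, 20])
bottom_oligos = split_at(bottom, [5, 15, 25])
```

Offsetting the junctions on the two strands is one plausible design choice: each oligomer then bridges a junction in the opposite strand, which holds the annealed assembly together prior to ligation.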

[0042] The foregoing description describes a particularly preferred embodiment of the invention in which every codon encoding the normal E. coli LexA DNA binding domain is substituted with a corresponding codon that is preferred by A. thaliana (i.e., is completely or 100 percent optimized for usage by A. thaliana), when the two codons are different (i.e., when a particular amino acid may be encoded by more than one codon, and a codon preferred by E. coli differs from one preferred by A. thaliana). The invention also encompasses synthetic nucleotide sequences and proteins that are partially (i.e., less than 100 percent) optimized for usage by A. thaliana, and the uses therefor. In this alternative embodiment of the invention, using the methods described above, the codons for fewer than all of the amino acids of the LexA DNA binding domain that may be encoded for by more than one codon are replaced in the synthetic sequence with a codon preferred by A. thaliana. Preferably, more than about 50 percent of the codons of the LexA binding domain are replaced in the synthetic sequence with a codon preferred by A. thaliana; more preferably, more than about 80 percent of the codons of the LexA binding domain are replaced in the synthetic sequence with a codon preferred by A. thaliana; most preferably, 100 percent of the codons of the LexA binding domain are codons that are preferred by A. thaliana.
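One possible way to express the degree of optimization of a candidate sequence is the percentage of its codons that are A. thaliana preferred codons according to Table 1. The following sketch makes that assumption; it measures the share of preferred codons in a sequence rather than the share of codons replaced relative to the E. coli sequence, and the function name and toy sequences are hypothetical.

```python
# The 20 most preferred A. thaliana codons listed in Table 1.
A_THALIANA_PREFERRED = {
    "AGA", "CTT", "TCT", "ACC", "CCA", "GCT", "GGA", "GTT", "AAG", "AAC",
    "CAG", "CAC", "GAG", "GAT", "TAC", "TGC", "TTC", "ATC", "ATG", "TGG",
}

def percent_preferred(cds):
    """Percentage of in-frame codons that are A. thaliana preferred codons."""
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("expected a non-empty, in-frame coding sequence")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    hits = sum(1 for codon in codons if codon in A_THALIANA_PREFERRED)
    return 100.0 * hits / len(codons)

# Toy sequences (assumption): Met-Lys-Ala-Leu-Thr encoded two ways.
partial = percent_preferred("ATGAAAGCGCTTACC")  # AAA and GCG are not preferred
full = percent_preferred("ATGAAGGCTCTTACC")     # every codon is preferred
```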

[0043] The synthetic nucleotide sequences and proteins set forth in the present invention are described as being optimized for usage in Arabidopsis species, particularly Arabidopsis thaliana; however, the skilled artisan will appreciate that the synthetic nucleotide sequences and proteins encoding the LexA DNA binding domain optimized for Arabidopsis usage are useful in non-bacterial species other than Arabidopsis thaliana. The LexA DNA binding domain set forth herein will be more efficiently expressed in higher eukaryotes (e.g., plants and animals), and more specifically will be more efficiently expressed in dicotyledonous plants, which include but are by no means limited to species of legumes (from the family Fabaceae), including soybean, peanut, and alfalfa; species of the Solanaceae family such as tomato, eggplant and potato; species of the family Brassicaceae such as cabbage, turnips and rapeseed; species of the family Rosaceae such as apples, pears and berries; and members of the families Cucurbitaceae (cucumbers), Chenopodiaceae (beets) and Umbelliferae (carrots).

[0044] The present invention provides an advantageously modified DNA binding domain for the enhanced expression of desired heterologous protein genes in transgenic plants. To this end, one embodiment of the present invention is a DNA construct comprising a DNA sequence encoding the LexA synthetic DNA binding domain of the invention. Such DNA constructs accordingly provide for the preparation of stably transformed cells expressing heterologous protein, which transformed cells are also an aspect of the invention. Still further, the synthetic DNA binding domain of the present invention provides for the subsequent regeneration of fertile, transgenic plants and progeny containing desired heterologous protein genes. These aspects of the invention are further described herein below.

[0045] DNA constructs (also referred to herein as DNA vectors) of the present invention comprise the nucleotide sequence of the synthetic LexA binding domain described herein, which nucleotide sequence is preferably the sequence provided herein as SEQ ID NO: 1. The preparation of DNA constructs is well known in the art. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual (1989). The DNA constructs of the present invention are useful in the transformation of cells (e.g., plant cells), and thus useful in the expression of heterologous genes in the cells. The expression of a heterologous DNA sequence (i.e., gene) in a plant requires proper transcriptional initiation regulatory regions that are recognized in the host plant to be transformed, with the regions linked in a manner which permits the transcription of the coding sequence and subsequent processing in the nucleus. Thus, a DNA construct preferably contains some or all of the necessary elements to permit the transcription and ultimate expression of the coding sequence in the host plant.

[0046] DNA constructs of the present invention may contain suitable promoters for the expression of heterologous genes in plants. The term “promoter” refers to the nucleotide sequences at the 5′ end of a structural gene which direct the initiation of transcription. Generally, promoter sequences are necessary, but not always sufficient, to drive the expression of a downstream gene. In the construction of heterologous promoter/structural gene combinations, the structural gene is placed under the regulatory control of a promoter such that the expression of the gene is controlled by promoter sequences. The promoter is positioned preferentially upstream to the structural gene and at a distance from the transcription start site that approximates the distance between the promoter and the gene it controls in its natural setting. As is known in the art, some variation in this distance can be tolerated without loss of promoter function. As used herein, the term “operatively linked” means that a promoter is connected to a coding region in such a way that the transcription of that coding region is controlled and regulated by that promoter. Means for operatively linking a promoter to a coding region are well known in the art.

[0047] For expression in plants, suitable promoters must be chosen for the host cell, the selection of which promoters is well within the skill of one knowledgeable in the art. Promoters useful in the practice of the present invention include, but are not limited to, constitutive, inducible, temporally regulated, developmentally regulated, chemically regulated, tissue-preferred and tissue-specific promoters.

[0048] Numerous promoters are known or are found to facilitate transcription of RNA in plant cells and can be used in the DNA construct of the present invention. Examples of suitable promoters include the nopaline synthase (NOS) and octopine synthase (OCS) promoters, the light-inducible promoter from the small subunit of ribulose bis-phosphate carboxylase, the CaMV 35S and 19S promoters, the full-length transcript promoter from Figwort mosaic virus, histone promoters, tubulin promoters, or the mannopine synthase promoter (MAS). The promoter may also be one that causes preferential expression in a particular tissue, such as leaves, stems, roots, or meristematic tissue, or the promoter may be inducible, such as by light, heat stress, water stress or chemical application or production by the plant. Exemplary green tissue-specific promoters include the maize phosphoenol pyruvate carboxylase (PEPC) promoter, small subunit ribulose bis-carboxylase promoters (ssRUBISCO) and the chlorophyll a/b binding protein promoters.

[0049] Additional promoters useful in the present invention include, but are not limited to, the promoter of one of the several actin genes, which are known to be expressed in most cell types. Yet another constitutive promoter useful in the practice of the present invention is derived from ubiquitin, which is another gene product known to accumulate in many cell types. The ubiquitin promoter has been cloned from several species for use in transgenic plants (e.g., sunflower (Binet et al., Plant Science 79: 87-94 (1991)); and maize (Christensen et al., Plant Molec. Biol. 12, 619-632 (1989))). Further useful promoters are the U2 and U5 snRNA promoters from maize (Brown et al., Nucleic Acids Res. 17, 8991 (1989)) and the promoter from alcohol dehydrogenase (Dennis et al., Nucleic Acids Res. 12, 3983 (1984)).

[0050] Tissue-specific or tissue-preferential promoters useful in the present invention in plants are those which direct expression in root, pith, leaf or pollen. Such promoters are disclosed in U.S. Pat. No. 5,625,136 (herein incorporated by reference in its entirety). Also useful are promoters which confer seed-specific expression, such as those disclosed by Schernthaner et al., EMBO J. 7: 1249 (1988); anther-specific promoters ant32 and ant43D; anther (tapetal) specific promoter B6 (Huffman et al., J. Cell. Biochem. 17B, Abstract #D209 (1993)); and pistil-specific promoters such as a modified S13 promoter (Dzelkalns et al., Plant Cell 5, 855 (1993)).

[0051] Other plant promoters may be obtained, preferably from plants or plant viruses, and may be utilized so long as the selected promoter is capable of causing sufficient expression in a plant to produce an effective amount of the desired protein. Preferred constitutive promoters include but are not limited to the CaMV 35S and 19S promoters (see U.S. Pat. No. 5,352,605, the disclosure of which is incorporated herein in its entirety). Any promoter used in the present invention may be modified, if desired, to alter its control characteristics. For example, the CaMV 35S or 19S promoters may be enhanced by the method described in Kay et al., Science (1987) Vol. 236, pp. 1299-1302.

[0052] The DNA sequences that comprise the DNA constructs of the present invention are preferably carried on suitable vectors, which are known in the art. Preferred vectors are plasmids that may be propagated in a plant cell. Particularly preferred vectors for transformation are those useful for transformation of plant cells or of Agrobacteria, as described further below. For Agrobacterium-mediated transformation, the preferred vector is a Ti-plasmid derived vector. Other appropriate vectors which can be utilized as starting materials are known in the art. Suitable vectors for transforming plant tissue and protoplasts have been described by deFramond, A. et al., Bio/Technology 1, 263 (1983); An, G. et al., EMBO J. 4, 277 (1985); and Rothstein, S. J. et al., Gene 53, 153 (1987). In addition to these, many other vectors have been described in the art which are suitable for use as starting materials in the present invention.

[0053] The DNA encoding the synthetic LexA binding domain of the present invention, and the DNA constructs comprising it, have applicability to any structural gene that is desired to be introduced into a plant to provide any desired characteristic in the plant, such as herbicide tolerance, virus tolerance, insect tolerance, disease tolerance, drought tolerance, or enhanced or improved phenotypic characteristics such as improved nutritional or processing characteristics.

[0054] In a particularly preferred embodiment of the invention, DNA constructs of the present invention also comprise DNA sequences encoding transactivation domains. Transactivation domains can be defined as amino acid sequences that, when combined with the DNA binding domain, increase productive transcription initiation by RNA polymerases. (See generally Ptashne, Nature 335, 683-689 (1988).) Different transactivation domains are known to have different degrees of effectiveness in their ability to increase transcription initiation. In the present invention it is desirable to use transactivation domains that have superior transactivating effectiveness in plant cells in order to create a high level of target polypeptide expression in response to the presence of chemical ligand.

[0055] Transactivation domains that have been shown to be particularly effective in the method of the present invention include but are not limited to VP16 (isolated from the herpes simplex virus), C1 (isolated from maize), and Thm18 (isolated from tomato). One preferred example for the use of the codon-optimized LexA DNA-binding domain is the in-frame fusion of this DNA-binding domain to other domains that have other functions; for example, the fusion to transcription activator domains like VP16 from herpes simplex virus or Thm18 from tomato, or the fusion to TATA-box binding proteins (TBPs) such as the TBP1 protein from Arabidopsis thaliana. Other transactivation domains may also be effective.

[0056] Transgenes (heterologous genes to be transformed into a plant cell) will often be genes that direct the expression of a particular protein or polypeptide product, but they may also be non-expressible DNA segments, e.g., transposons that do not direct their own transposition. As used herein, an “expressible gene” is any gene that is capable of being transcribed into RNA (e.g., mRNA, antisense RNA, etc.) or translated into a protein, expressed as a trait of interest, or the like, and is not limited to selectable, screenable or non-selectable marker genes. The invention also contemplates that, where an expressible gene that is not necessarily a marker gene is employed in combination with a marker gene, one may employ the separate genes on either the same or different DNA segments for transformation. In the latter case, the different vectors are delivered concurrently to recipient cells to maximize cotransformation.

[0057] Any heterologous gene or nucleic acid that is desired to be expressed in a plant is suitable for the practice of the present invention. Heterologous genes to be transformed and expressed in the plants of the present invention include but are not limited to genes that encode resistance to diseases and insects, genes conferring nutritional value, genes conferring antifungal, antibacterial or antiviral activity, and the like. Alternatively, therapeutic (e.g., for veterinary or medical uses) or immunogenic (e.g., for vaccination) peptides and proteins can be expressed in plants transformed with the synthetic LexA DNA binding domain of the present invention. Likewise, the transfer of any nucleic acid for controlling gene expression in a plant is contemplated as an aspect of the present invention. For example, the nucleic acid to be transferred can encode an antisense oligonucleotide. Alternatively, plants may be transformed with one or more genes to reproduce enzymatic pathways for chemical synthesis or other industrial processes.

[0058] In order to improve the ability to identify transformants, one may desire to employ a selectable or screenable marker gene as, or in addition to, the expressible gene of interest. “Marker genes” are genes that impart a distinct phenotype to cells expressing the marker gene and thus allow such transformed cells to be distinguished from cells that do not have the marker. Such genes may encode either a selectable or screenable marker, depending on whether the marker confers a trait which one can ‘select’ for by chemical means, i.e., through the use of a selective agent (e.g., a herbicide, antibiotic, or the like), or whether it is simply a trait that one can identify through observation or testing, i.e., by ‘screening’ (e.g., the R-locus trait). Of course, many examples of suitable marker genes are known to the art and can be employed in the practice of the invention. The selectable marker gene may be the only heterologous gene expressed by a transformed cell, or may be expressed in addition to another heterologous gene transformed into and expressed in the transformed cell. Selectable marker genes are utilized for the identification and selection of transformed cells or tissues. Selectable marker genes include genes encoding antibiotic resistance, such as those encoding neomycin phosphotransferase II (NEO) and hygromycin phosphotransferase (HPT), as well as genes conferring resistance to herbicidal compounds. Herbicide resistance genes generally code for a modified target protein insensitive to the herbicide or for an enzyme that degrades or detoxifies the herbicide in the plant before it can act. See DeBlock et al., EMBO J. 6, 2513 (1987); DeBlock et al., Plant Physiol. 91, 691 (1989); Fromm et al., BioTechnology 8, 833 (1990); Gordon-Kamm et al., Plant Cell 2, 603 (1990). For example, resistance to glyphosate or sulfonylurea herbicides has been obtained using genes coding for the mutant target enzymes 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) and acetolactate synthase (ALS), respectively. Resistance to glufosinate ammonium, bromoxynil, and 2,4-dichlorophenoxyacetate (2,4-D) has been obtained by using bacterial genes encoding phosphinothricin acetyltransferase, a nitrilase, or a 2,4-dichlorophenoxyacetate monooxygenase, which detoxify the respective herbicides.

[0059] Selectable marker genes include, but are not limited to, genes encoding: neomycin phosphotransferase II (Fraley et al., CRC Critical Reviews in Plant Science 4, 1 (1986)); cyanamide hydratase (Maier-Greiner et al., Proc. Natl. Acad. Sci. USA 88, 4250 (1991)); aspartate kinase; dihydrodipicolinate synthase (Perl et al., BioTechnology 11, 715 (1993)); bar gene (Toki et al., Plant Physiol. 100, 1503 (1992); Meagher et al., Crop Sci. 36, 1367 (1996)); tryptophan decarboxylase (Goddijn et al., Plant Mol. Biol. 22, 907 (1993)); neomycin phosphotransferase (NEO; Southern et al., J. Mol. Appl. Gen. 1, 327 (1982)); hygromycin phosphotransferase (HPT or HYG; Shimizu et al., Mol. Cell. Biol. 6, 1074 (1986)); dihydrofolate reductase (DHFR); phosphinothricin acetyltransferase (DeBlock et al., EMBO J. 6, 2513 (1987)); 2,2-dichloropropionic acid dehalogenase (Buchanan-Wollaston et al., J. Cell. Biochem. 13D, 330 (1989)); acetohydroxyacid synthase (U.S. Pat. No. 4,761,373 to Anderson et al.; Haughn et al., Mol. Gen. Genet. 221, 266 (1988)); 5-enolpyruvyl-shikimate-phosphate synthase (aroA; Comai et al., Nature 317, 741 (1985)); haloarylnitrilase (WO 87/04181 to Stalker et al.); acetyl-coenzyme A carboxylase (Parker et al., Plant Physiol. 92, 1220 (1990)); dihydropteroate synthase (sulI; Guerineau et al., Plant Mol. Biol. 15, 127 (1990)); and 32 kDa photosystem II polypeptide (psbA; Hirschberg et al., Science 222, 1346 (1983)).

[0060] Also included are genes encoding resistance to: chloramphenicol (Herrera-Estrella et al., EMBO J. 2, 987 (1983)); methotrexate (Herrera-Estrella et al., Nature 303, 209 (1983); Meijer et al., Plant Mol. Biol. 16, 807 (1991)); hygromycin (Waldron et al., Plant Mol. Biol. 5, 103 (1985); Zhijian et al., Plant Science 108, 219 (1995); Meijer et al., Plant Mol. Biol. 16, 807 (1991)); streptomycin (Jones et al., Mol. Gen. Genet. 210, 86 (1987)); spectinomycin (Bretagne-Sagnard et al., Transgenic Res. 5, 131 (1996)); bleomycin (Hille et al., Plant Mol. Biol. 7, 171 (1986)); sulfonamide (Guerineau et al., Plant Mol. Biol. 15, 127 (1990)); bromoxynil (Stalker et al., Science 242, 419 (1988)); 2,4-D (Streber et al., Bio/Technology 7, 811 (1989)); and phosphinothricin (DeBlock et al., EMBO J. 6, 2513 (1987)).

[0061] The bar gene confers herbicide resistance to glufosinate-type herbicides, such as phosphinothricin (PPT) or bialaphos, and the like. As noted above, other selectable markers that could be used in the vector constructs include, but are not limited to, the pat gene, also for bialaphos and phosphinothricin resistance, the ALS gene for imidazolinone resistance, the HPH or HYG gene for hygromycin resistance, the EPSP synthase gene for glyphosate resistance, the Hm1 gene for resistance to the Hc-toxin, and other selective agents used routinely and known to one of ordinary skill in the art. See generally, Yarranton, Curr. Opin. Biotech. 3, 506 (1992); Christopherson et al., Proc. Natl. Acad. Sci. USA 89, 6314 (1992); Yao et al., Cell 71, 63 (1992); Reznikoff, Mol. Microbiol. 6, 2419 (1992); Barkley et al., The Operon 177-220 (1980); Hu et al., Cell 48, 555 (1987); Brown et al., Cell 49, 603 (1987); Figge et al., Cell 52, 713 (1988); Deuschle et al., Proc. Natl. Acad. Sci. USA 86, 5400 (1989); Fuerst et al., Proc. Natl. Acad. Sci. USA 86, 2549 (1989); Deuschle et al., Science 248, 480 (1990); Labow et al., Mol. Cell. Biol. 10, 3343 (1990); Zambretti et al., Proc. Natl. Acad. Sci. USA 89, 3952 (1992); Baim et al., Proc. Natl. Acad. Sci. USA 88, 5072 (1991); Wyborski et al., Nuc. Acids Res. 19, 4647 (1991); Hillen and Wissman, Topics in Mol. and Struc. Biol. 10, 143 (1989); Degenkolb et al., Antimicrob. Agents Chemother. 35, 1591 (1991); Kleinschmidt et al., Biochemistry 27, 1094 (1988); Gatz et al., Plant J. 2, 397 (1992); Gossen et al., Proc. Natl. Acad. Sci. USA 89, 5547 (1992); Oliva et al., Antimicrob. Agents Chemother. 36, 913 (1992); Hlavka et al., Handbook of Experimental Pharmacology 78 (1985); and Gill et al., Nature 334, 721 (1988). The disclosures described herein are incorporated by reference.

[0062] The above list of selectable marker genes is not meant to be limiting. Any selectable marker gene can be used in the present invention.

[0063] In view of the foregoing, it is apparent that one aspect of the present invention is transformed plant cells comprising the synthetic LexA DNA binding domain of the present invention. “Transformation”, as defined herein, describes a process by which heterologous nucleic acid enters and changes a recipient cell. It may occur under natural or artificial conditions using various methods well known in the art. Transformation may rely on any known method for the insertion of foreign nucleic acid sequences into a eukaryotic host cell. Such “transformed” cells include stably transformed cells in which the inserted DNA is capable of replication either as an autonomously replicating plasmid or as part of the host chromosome. They also include cells which transiently express the inserted DNA or RNA for limited periods of time.

[0064] In a preferred embodiment of the invention, recipient cells for transformation are plant cells, more preferably dicot plant cells, even more preferably Arabidopsis species plant cells, and most preferably Arabidopsis thaliana plant cells. “Plant cells” as used herein includes plant cells in plant tissue, and plant cells and protoplasts in culture. Plant tissue includes differentiated and undifferentiated tissues of plants, including but not limited to, roots, shoots, leaves, pollen, seeds, tumor tissue and various forms of cells in culture, such as single cells, protoplasts, embryos and callus tissue. The plant tissue may be in planta, or in organ, tissue or cell culture.

[0065] The recombinant DNA molecule carrying the synthetic LexA DNA binding domain of the invention and optionally a structural gene under promoter control can be introduced into plant tissue by any means known to those skilled in the art. The technique used for a given plant species or specific type of plant tissue depends on the known successful techniques. As novel means are developed for the stable insertion of foreign genes into plant cells and for manipulating the modified cells, skilled artisans will be able to select from known means to achieve a desired result. Means for introducing recombinant DNA into plant tissue include, but are not limited to, direct DNA uptake (Paszkowski, J. et al. (1984) EMBO J. 3, 2717), electroporation (Fromm, M., et al. Proc. Natl. Acad. Sci. USA 82, 5824 (1985)), microinjection (Crossway, A. et al. Mol. Gen. Genet. 202, 179 (1986)), or T-DNA mediated transfer from Agrobacterium tumefaciens to the plant tissue, which techniques are known in the art. There appears to be no fundamental limitation of T-DNA transformation to the natural host range of Agrobacterium. Representative T-DNA vector systems are described in the following references: An, G. et al. EMBO J. 4, 277 (1985); Herrera-Estrella, L. et al., Nature 303, 209 (1983); Herrera-Estrella, L. et al. EMBO J. 2, 987 (1983); Herrera-Estrella, L. et al. in Plant Genetic Engineering, New York: Cambridge University Press, p. 63 (1985). Once introduced into the plant tissue, the expression of the structural gene may be assayed by any means known to the art, and expression may be measured as mRNA transcribed or as protein synthesized, as provided herein.

[0066] Transgenic plants comprising the synthetic LexA DNA binding domain of the present invention (as present, for example, in a DNA construct of the present invention, or a transformed cell of the present invention) are also an aspect of the present invention. Procedures for cultivating transformed cells to useful cultivars are known to those skilled in the art. Techniques are known for the in vitro culture of plant tissue, and in a number of cases, for regeneration into whole plants. Further aspects of the invention are plant tissue, plants or seeds containing the chimeric DNA sequences described above. Preferred are plant tissues, plants or seeds containing those chimeric DNA sequences which are mentioned as being preferred.

[0067] The invention thus relates, in certain embodiments, to transgenic plants comprising the synthetic LexA DNA binding domain of the present invention. As used herein, the term “transgenic plants” is intended to refer to plants that have incorporated DNA sequences, including but not limited to genes which are perhaps not normally present, DNA sequences not normally transcribed into RNA or translated into a protein (“expressed”), or any other genes or DNA sequences which one desires to introduce into the non-transformed plant, such as genes which may normally be present in the non-transformed plant but which one desires to either genetically engineer or to have altered expression. It is contemplated that in some instances the genome of transgenic plants of the present invention will have been augmented through the stable introduction of the transgene. However, in other instances, the introduced gene will replace an endogenous sequence.

[0068] One example of a transgenic plant of the present invention may be prepared by the process of creating a translational fusion protein gene and introducing the fusion into a plant. One example of a plant to be transformed by the methods of the invention is a wild-type Arabidopsis. A specific illustration of the method of the invention would involve constructing gene fusions comprising the LexA DNA binding domain optimized for usage in Arabidopsis thaliana and a DNA segment encoding an exogenous protein one desires to express, and introducing the fusion into wild-type Arabidopsis.

[0069] Alternatively, the gene introduced into the transformed plant line will comprise an exogenous protein gene operatively linked to its own promoter or another promoter that is active in plants. This gene will contain the cloned LexA DNA binding domain optimized for Arabidopsis thaliana in a DNA construct for the purpose of increasing the expression of that exogenous protein gene.

[0070] The transformed cells, identified by selection or screening and cultured in an appropriate medium that supports regeneration as provided herein, will then be allowed to mature into plants. Plants are preferably matured either in a growth chamber or greenhouse. Plants are regenerated from about 6 weeks to 10 months after a transformant is identified, depending on the initial tissue. During regeneration, cells are grown on solid media in tissue culture vessels. Illustrative embodiments of such vessels are petri dishes and Plant Con®s. After the regenerating plants have reached the stage of shoot and root development, they may be transferred to a greenhouse for further growth and testing. Progeny may be recovered from the transformed plants and tested for expression of the exogenous expressible gene by localized application of an appropriate substrate to plant parts such as leaves.

[0071] The regenerated plants are screened for transformation by standard methods illustrated below. Progeny of the regenerated plants are continuously screened and selected for the continued presence of the integrated DNA sequence in order to develop improved plant and seed lines. The DNA sequence can be moved into other genetic lines by a variety of techniques, including classical breeding, protoplast fusion, nuclear transfer and chromosome transfer.

[0072] After heterologous DNA has been delivered to recipient cells and plants by any of the methods discussed above, the next step is generally to identify the cells exhibiting successful or enhanced expression of a heterologous gene for further culturing and plant regeneration. As mentioned above, in order to improve the ability to identify transformants, one may desire to employ a selectable or screenable marker gene as, or in addition to, the expressible gene of interest. In this case, one would then generally assay the potentially transformed cell population by exposing the cells to a selective agent or agents, or one would screen the cells for the desired marker gene.

[0073] “Screening” generally refers to identifying the cells exhibiting expression of a heterologous gene that has been transformed into the plant. Usually, screening is carried out to select successfully transformed seeds (i.e., transgenic seeds) for further cultivation and plant generation (i.e., for the production of transgenic plants). As mentioned above, in order to improve the ability to identify transformants, one may desire to employ a selectable or screenable marker gene as, or in addition to, the heterologous gene of interest. In this case, one would then generally assay the potentially transformed cells, seeds or plants by exposing the cells, seeds, plants, or seedlings to a selective agent or agents, or one would screen the cells, seeds, plants or tissues of the plants for the desired marker gene. For example, transgenic cells, seeds or plants may be screened under selective conditions, such as by growing the seeds or seedlings on media containing selective agents, such as antibiotics or herbicides (e.g., hygromycin, kanamycin, paromomycin or BASTA®), the successfully transformed plants having been transformed with genes encoding resistance to such selective agents.

[0074] To additionally confirm the presence of the heterologous nucleic acid or “transgene(s)” in the seeds of the cultivated plant or in the regenerated plants produced from those seeds, a variety of assays may be performed. Such assays include, for example, molecular biological assays, such as Southern and Northern blotting and PCR; biochemical assays, such as detecting the presence of a protein product, e.g., by immunological means (ELISAs and Western blots) or by enzymatic function; plant part assays, such as leaf or root assays; and also analysis of the phenotype of the whole regenerated plant.

[0075] While Southern blotting and PCR may be used to detect the gene(s) in question, they do not provide information as to whether the gene is being expressed. Expression of the heterologous gene may be evaluated by specifically identifying the protein products of the introduced genes or evaluating the phenotypic changes brought about by their expression.

[0076] Assays for the production and identification of specific proteins may make use of physical-chemical, structural, functional, or other properties of the proteins. Unique physical-chemical or structural properties allow the proteins to be separated and identified by electrophoretic procedures, such as native or denaturing gel electrophoresis or isoelectric focusing, or by chromatographic techniques such as ion exchange or gel exclusion chromatography. The unique structures of individual proteins offer opportunities for use of specific antibodies to detect their presence in formats such as an ELISA assay. Combinations of approaches may be employed with even greater specificity such as Western blotting in which antibodies are used to locate individual gene products that have been separated by electrophoretic techniques. Additional techniques may be employed to absolutely confirm the identity of the product of interest such as evaluation by amino acid sequencing following purification. Although these techniques are among the most commonly employed, other procedures are known in the art and may be additionally used.

[0077] The following Examples are provided to illustrate the present invention, and should not be construed as limiting thereof.

EXAMPLE 1 Production of the LexA DNA Binding Domain Adapted for Usage in Arabidopsis

[0078] The 86 amino acid residues of the E. coli DNA-binding domain of LexA were translated back to a sequence of nucleic acids, replacing each E. coli codon with the codon most frequently utilized in Arabidopsis thaliana for the same amino acid, according to K. Wada, et al., Nucleic Acids Res. 19, 1981-1986 (1991).
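By way of illustration only, the back-translation described above can be sketched in Python. The codon table below is an assumption: it is read off the codons that actually appear in SEQ ID NO: 1 (one codon per amino acid) and is not a reproduction of the Wada et al. usage tables. The names CODON_FOR, LEXA_DBD and back_translate are hypothetical.

```python
# One codon per amino acid, taken from the codons used in SEQ ID NO: 1.
# Assumption: this mirrors the "most frequently utilized" choice described
# in the text; it is not the full codon-usage table of Wada et al.
CODON_FOR = {
    "A": "gct", "D": "gat", "E": "gag", "F": "ttc", "G": "gga",
    "H": "cac", "I": "atc", "K": "aag", "L": "ctt", "M": "atg",
    "N": "aac", "P": "cca", "Q": "cag", "R": "aga", "S": "tct",
    "T": "acc", "V": "gtt",
}

# SEQ ID NO: 2 in one-letter code, grouped 16 residues per row
# in the same way as the sequence listing.
LEXA_DBD = (
    "MKALTARQQEVFDLIR"
    "DHISQTGMPPTRAEIA"
    "QRLGFRSPNAAEEHLK"
    "ALARKGVIEIVSGASR"
    "GIRLLQEEEEGLPLVG"
    "RVAAGE"
)


def back_translate(protein: str) -> str:
    """Replace every residue with its single chosen codon."""
    return "".join(CODON_FOR[residue] for residue in protein)


synthetic_gene = back_translate(LEXA_DBD)
print(len(LEXA_DBD), len(synthetic_gene))  # 86 258
```

Because every amino acid maps to exactly one codon, the result is fully determined by the protein sequence; a design that also weighed secondary codons would need the complete usage table.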

[0079] The result of the translation back was a DNA sequence of 258 nucleic acid residues. The sequence representing one strand of a double-stranded DNA encoding the LexA DNA binding domain was divided into three segments of 84, 90 and 84 nucleotides, respectively. The DNA sequence of 258 nucleic acid residues representing the other (i.e., complementary) strand of the double-stranded DNA encoding the LexA DNA binding domain was divided into four segments of 51, 81, 81 and 45 nucleic acid residues. The DNA sequences of the three segments of the first strand were as follows:

AGCTTCATATGAAGGCTCTTACCGCTAGACAGCAGGAGGTTTTCGATCTTATCAGAGATCACATCTCTCAGACCGGAATGCCACCAACCAGA [SEQ ID NO:3]
GCTGAGATCGCTCAGAGACTTGGATTCAGATCTCCAAACGCTGCTGAGGAGCACCTTAAGGCTCTTGCTAGAAAGGGAGTTATCGAGATC [SEQ ID NO:4]
GTTTCTGGAGCTTCTAGAGGAATCAGACTTCTTCAGGAGGAGGAGGAGGGACTTCCACTTGTTGGAAGAGTTGCTGCTGGAGAGG [SEQ ID NO:5]

The DNA sequences of the four segments of the second strand were as follows:

ATCTCTGATAAGATCGAAAACCTGCTGCTGTCTAGCGGTAAGAGCCTTCATATGA [SEQ ID NO:6]
CTCAGCAGCGTTTGGAGATCTGAATCCAAGTCTCTGAGCGATCTCAGCTCTGGTTGGTGGCATTCCGGTCTGAGAGATGTG [SEQ ID NO:7]
CTCCTGAAGAAGTCTGATTCCTCTAGAAGCTCCAGAAACGATCTCGATAACTCCCTTTCTAGCAAGAGCCTTAAGGTGCTC [SEQ ID NO:8]
GATCCCTCTCCAGCAGCAACTCTTCCAACAAGTGGAAGTCCCTGCTCCTC [SEQ ID NO:9]
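The staggered arrangement of the two strands can be checked with a short Python sketch. The oligomer sequences are copied from the sequence listing (SEQ ID NOs: 3, 4, 5 and 7). The slice offsets of 8 and 1 nucleotides are an inference from comparing the listing with SEQ ID NO: 1, not figures stated in the text, and the names revcomp, top, coding and bridge are hypothetical.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


# Top-strand oligomers as given in the sequence listing.
SEQ3 = ("agcttcatat" "gaaggctctt" "accgctagac" "agcaggaggt" "tttcgatctt"
        "atcagagatc" "acatctctca" "gaccggaatg" "ccaccaacca" "ga")
SEQ4 = ("gctgagatcg" "ctcagagact" "tggattcaga" "tctccaaacg" "ctgctgagga"
        "gcaccttaag" "gctcttgcta" "gaaagggagt" "tatcgagatc")
SEQ5 = ("gtttctggag" "cttctagagg" "aatcagactt" "cttcaggagg" "aggaggaggg"
        "acttccactt" "gttggaagag" "ttgctgctgg" "agagg")
# One bottom-strand oligomer from the sequence listing.
SEQ7 = ("ctcagcagcg" "tttggagatc" "tgaatccaag" "tctctgagcg" "atctcagctc"
        "tggttggtgg" "cattccggtc" "tgagagatgt" "g")

top = SEQ3 + SEQ4 + SEQ5

# Assumed trimming: 8 added nucleotides at the 5' end and 1 at the 3' end
# flank the 258-nucleotide coding region.
coding = top[8:-1]

# The bottom-strand oligomer, read as its reverse complement, should lie on
# the top strand and span the junction between SEQ3 and SEQ4.
bridge = revcomp(SEQ7)
start = top.find(bridge)
junction = len(SEQ3)
spans_junction = start != -1 and start < junction < start + len(bridge)
print(len(coding), spans_junction)
```

A bottom-strand oligomer that spans a top-strand junction is what holds adjacent top-strand oligomers together during annealing, which is presumably why the two strands were divided at different positions.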

[0080] The seven sequences described above were chemically synthesized as DNA oligomers on a modified ABI 391 machine, using standard β-cyano-ethyl chemistry, each oligomer being phosphorylated at its 5′ end. Four of the oligomers also had additional nucleotides added to their ends to create restriction sites (“sticky ends”) for the purpose of ligating the annealed DNA product into a cloning vector that has been digested with the restriction enzymes HindIII and BamHI.
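As a minimal sketch of the sticky-end design, the following Python fragment compares the 5′ ends of the two outermost oligomers (taken from the sequence listing) with the overhangs left by HindIII (A^AGCTT) and BamHI (G^GATCC). The dictionary layout and the function names are hypothetical, and only the first 15 nucleotides of each oligomer are used.

```python
# Recognition site and the 5' overhang left after cutting.
ENZYMES = {
    "HindIII": {"site": "aagctt", "overhang": "agct"},
    "BamHI": {"site": "ggatcc", "overhang": "gatc"},
}

SEQ3_5PRIME = "agcttcatatgaagg"  # first 15 nt of SEQ ID NO: 3 (first strand)
SEQ9_5PRIME = "gatccctctccagca"  # first 15 nt of SEQ ID NO: 9 (second strand)


def compatible_enzyme(oligo_5prime: str):
    """Return the enzyme whose overhang matches the oligomer's 5' end."""
    for name, enzyme in ENZYMES.items():
        if oligo_5prime.startswith(enzyme["overhang"]):
            return name
    return None


def regenerates_site(oligo_5prime: str, name: str) -> bool:
    """True if ligation to the cut vector end would rebuild the full site."""
    site = ENZYMES[name]["site"]
    # The vector contributes the first base of the site; the oligomer
    # supplies the remaining five.
    return site[0] + oligo_5prime[:5] == site


for label, end in (("SEQ ID NO: 3", SEQ3_5PRIME), ("SEQ ID NO: 9", SEQ9_5PRIME)):
    enzyme = compatible_enzyme(end)
    print(label, enzyme, regenerates_site(end, enzyme))
```

Under these assumptions the first-strand end pairs with the HindIII-cut end of the vector and the second-strand end with the BamHI-cut end, which fixes the orientation of the insert.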

[0081] The seven oligomers were then annealed together. For annealing, the desiccated oligomers were dissolved at high concentration (10 OD₂₆₀ units/100 μl) in STE buffer (50 mM NaCl, 10 mM Tris pH 8.0, 1 mM EDTA). Equal volumes (12 μl each) were mixed in a 1.5 ml centrifuge tube and heated in a water bath to 94° C., then slowly cooled down to room temperature over about 3 hours by “unplugging” the water bath (i.e., by allowing the water bath to reach room temperature naturally). The DNA was precipitated by adding 0.5 volumes (42 μl) of 7.5 M ammonium acetate, 0.03 volumes of 1 M MgCl₂ and 2.5 volumes of ethanol with mixing, followed by incubation at 4° C. for 12 hours, and centrifugation for 15 minutes at room temperature (RT) at 14,000 rpm. The DNA pellet was washed once with 500 μl of 70% ethanol, then air-dried and resuspended in 100 μl TE (10 mM Tris pH 8.0, 1 mM EDTA).
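The precipitation volumes can be worked through with a short Python sketch. It assumes that every "volume" is measured relative to the pooled oligomer mixture (7 × 12 μl = 84 μl); the text confirms this for ammonium acetate (0.5 volumes = 42 μl), while for MgCl₂ and ethanol it is an assumption. The function name precipitation_volumes is hypothetical.

```python
def precipitation_volumes(n_oligos: int = 7, volume_each_ul: float = 12.0) -> dict:
    """Reagent volumes in microliters, relative to the pooled oligomer mix."""
    pooled = n_oligos * volume_each_ul
    volumes = {
        "pooled_oligomers": pooled,
        "ammonium_acetate_7.5M": 0.5 * pooled,   # stated in the text as 42 ul
        "MgCl2_1M": 0.03 * pooled,               # assumed relative to pooled mix
        "ethanol": 2.5 * pooled,                 # assumed relative to pooled mix
    }
    volumes["total"] = sum(volumes.values())
    return volumes


for reagent, microliters in precipitation_volumes().items():
    print(f"{reagent}: {microliters:.2f} ul")
```

On these assumptions the total stays well below the 1.5 ml capacity of the centrifuge tube named in the protocol.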

[0082] Annealing of the seven oligonucleotides yielded the artificial gene sequence provided above as SEQ ID NO: 1, which translates into the artificial protein (amino acid sequence) set forth above as SEQ ID NO: 2. The annealed DNA was then used for cloning into the HindIII and BamHI sites of the plasmid pUCNLSTBP1 (SEQ ID NO: 11), illustrated in FIG. 3.

[0083] The foregoing is illustrative of the present invention and is not to be construed as limiting thereof. The invention is defined by the following claims, with equivalents of the claims to be included therein.

1 11 1 258 DNA Artificial Sequence LexA DNA binding domain of E. coliusing Arabidopsis-optimized codons 1 atg aag gct ctt acc gct aga cag caggag gtt ttc gat ctt atc aga 48 Met Lys Ala Leu Thr Ala Arg Gln Gln GluVal Phe Asp Leu Ile Arg 1 5 10 15 gat cac atc tct cag acc gga atg ccacca acc aga gct gag atc gct 96 Asp His Ile Ser Gln Thr Gly Met Pro ProThr Arg Ala Glu Ile Ala 20 25 30 cag aga ctt gga ttc aga tct cca aac gctgct gag gag cac ctt aag 144 Gln Arg Leu Gly Phe Arg Ser Pro Asn Ala AlaGlu Glu His Leu Lys 35 40 45 gct ctt gct aga aag gga gtt atc gag atc gtttct gga gct tct aga 192 Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val SerGly Ala Ser Arg 50 55 60 gga atc aga ctt ctt cag gag gag gag gag gga cttcca ctt gtt gga 240 Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu ProLeu Val Gly 65 70 75 80 aga gtt gct gct gga gag 258 Arg Val Ala Ala GlyGlu 85 2 86 PRT Artificial Sequence LexA DNA binding domain proteinencoded by E. coli, containing codons optimized for Arabidopsis 2 MetLys Ala Leu Thr Ala Arg Gln Gln Glu Val Phe Asp Leu Ile Arg 1 5 10 15Asp His Ile Ser Gln Thr Gly Met Pro Pro Thr Arg Ala Glu Ile Ala 20 25 30Gln Arg Leu Gly Phe Arg Ser Pro Asn Ala Ala Glu Glu His Leu Lys 35 40 45Ala Leu Ala Arg Lys Gly Val Ile Glu Ile Val Ser Gly Ala Ser Arg 50 55 60Gly Ile Arg Leu Leu Gln Glu Glu Glu Glu Gly Leu Pro Leu Val Gly 65 70 7580 Arg Val Ala Ala Gly Glu 85 3 92 DNA E. coli 3 agcttcatat gaaggctcttaccgctagac agcaggaggt tttcgatctt atcagagatc 60 acatctctca gaccggaatgccaccaacca ga 92 4 90 DNA E. coli 4 gctgagatcg ctcagagact tggattcagatctccaaacg ctgctgagga gcaccttaag 60 gctcttgcta gaaagggagt tatcgagatc 905 85 DNA E. coli 5 gtttctggag cttctagagg aatcagactt cttcaggaggaggaggaggg acttccactt 60 gttggaagag ttgctgctgg agagg 85 6 55 DNA E. coli6 atctctgata agatcgaaaa cctcctgctg tctagcggta agagccttca tatga 55 7 81DNA E. coli 7 ctcagcagcg tttggagatc tgaatccaag tctctgagcg atctcagctctggttggtgg 60 cattccggtc tgagagatgt g 81 8 81 DNA E. 
coli 8 ctcctgaagaagtctgattc ctctagaagc tccagaaacg atctcgataa ctccctttct 60 agcaagagccttaaggtgct c 81 9 51 DNA E. coli 9 gatccctctc cagcagcaac tcttccaacaagtggaagtc cctccctcct c 51 10 20 DNA E. coli 10 catactgtat gagcatacag 2011 3307 DNA Artificial Sequence cloning vector 11 gacgaaaggg cctcgtgatacgcctatttt tataggttaa tgtcatgata ataatggttt 60 cttagacgtc aggtggcacttttcggggaa atgtgcgcgg aacccctatt tgtttatttt 120 tctaaataca ttcaaatatgtatccgctca tgagacaata accctgataa atgcttcaat 180 aatattgaaa aaggaagagtatgagtattc aacatttccg tgtcgccctt attccctttt 240 ttgcggcatt ttgccttcctgtttttgctc acccagaaac gctggtgaaa gtaaaagatg 300 ctgaagatca gttgggtgcacgagtgggtt acatcgaact ggatctcaac agcggtaaga 360 tccttgagag ttttcgccccgaagaacgtt ttccaatgat gagcactttt aaagttctgc 420 tatgtggcgc ggtattatcccgtattgacg ccgggcaaga gcaactcggt cgccgcatac 480 actattctca gaatgacttggttgagtact caccagtcac agaaaagcat cttacggatg 540 gcatgacagt aagagaattatgcagtgctg ccataaccat gagtgataac actgcggcca 600 acttacttct gacaacgatcggaggaccga aggagctaac cgcttttttg cacaacatgg 660 gggatcatgt aactcgccttgatcgttggg aaccggagct gaatgaagcc ataccaaacg 720 acgagcgtga caccacgatgcctgtagcaa tggcaacaac gttgcgcaaa ctattaactg 780 gcgaactact tactctagcttcccggcaac aattaataga ctggatggag gcggataaag 840 ttgcaggacc acttctgcgctcggcccttc cggctggctg gtttattgct gataaatctg 900 gagccggtga gcgtgggtctcgcggtatca ttgcagcact ggggccagat ggtaagccct 960 cccgtatcgt agttatctacacgacgggga gtcaggcaac tatggatgaa cgaaatagac 1020 agatcgctga gataggtgcctcactgatta agcattggta actgtcagac caagtttact 1080 catatatact ttagattgatttaaaacttc atttttaatt taaaaggatc taggtgaaga 1140 tcctttttga taatctcatgaccaaaatcc cttaacgtga gttttcgttc cactgagcgt 1200 cagaccccgt agaaaagatcaaaggatctt cttgagatcc tttttttctg cgcgtaatct 1260 gctgcttgca aacaaaaaaaccaccgctac cagcggtggt ttgtttgccg gatcaagagc 1320 taccaactct ttttccgaaggtaactggct tcagcagagc gcagatacca aatactgtcc 1380 ttctagtgta gccgtagttaggccaccact tcaagaactc tgtagcaccg cctacatacc 1440 tcgctctgct aatcctgttaccagtggctg ctgccagtgg cgataagtcg 
tgtcttaccg 1500 ggttggactc aagacgatagttaccggata aggcgcagcg gtcgggctga acggggggtt 1560 cgtgcacaca gcccagcttggagcgaacga cctacaccga actgagatac ctacagcgtg 1620 agctatgaga aagcgccacgcttcccgaag ggagaaaggc ggacaggtat ccggtaagcg 1680 gcagggtcgg aacaggagagcgcacgaggg agcttccagg gggaaacgcc tggtatcttt 1740 atagtcctgt cgggtttcgccacctctgac ttgagcgtcg atttttgtga tgctcgtcag 1800 gggggcggag cctatggaaaaacgccagca acgcggcctt tttacggttc ctggcctttt 1860 gctggccttt tgctcacatgttctttcctg cgttatcccc tgattctgtg gataaccgta 1920 ttaccgcctt tgagtgagctgataccgctc gccgcagccg aacgaccgag cgcagcgagt 1980 cagtgagcga ggaagcggaagagcgcccaa tacgcaaacc gcctctcccc gcgcgttggc 2040 cgattcatta atgcagctggcacgacaggt ttcccgactg gaaagcgggc agtgagcgca 2100 acgcaattaa tgtgagttagctcactcatt aggcacccca ggctttacac tttatgcttc 2160 cggctcgtat gttgtgtggaattgtgagcg gataacaatt tcacacagga aacagctatg 2220 accatgatta cgccaagcttatcgatcgga tccgctcttg ctccaaagaa gaagagaaag 2280 gttgctcttg ctggtaccatgactgatcaa ggattggaag ggagtaatcc agttgatctt 2340 agcaagcatc cttcagggattgttcctact cttcaaaaca ttgtctccac ggtgaactta 2400 gactgcaagc tagatcttaaagccatagct ttgcaggctc ggaatgctga atataatccc 2460 aagcgttttg ctgcggtgataatgaggatc agagaaccga agactacagc attaatattc 2520 gcctcaggga aaatggtctgtactggagct aagagcgagg acttttcgaa gatggctgct 2580 agaaagtatg ctaggattgtgcagaaattg ggattccctg caaaattcaa ggatttcaag 2640 attcagaata ttgtaggttcttgtgatgtc aaattcccta taagacttga aggtcttgct 2700 tactctcacg ctgctttctcaagttatgag cccgagctct tcccagggct gatttatagg 2760 atgaaagtcc caaaaatcgtccttctaatc tttgtctctg ggaagatcgt aataacagga 2820 gccaagatga gagatgagacctacaaagcc tttgagaata tataccccgt gctctcggaa 2880 ttcagaaaga tacagcaatagcctaggaat tcactggccg tcgttttaca acgtcgtgac 2940 tgggaaaacc ctggcgttacccaacttaat cgccttgcag cacatccccc tttcgccagc 3000 tggcgtaata gcgaagaggcccgcaccgat cgcccttccc aacagttgcg cagcctgaat 3060 ggcgaatggc gcctgatgcggtattttctc cttacgcatc tgtgcggtat ttcacaccgc 3120 atatggtgca ctctcagtacaatctgctct gatgccgcat agttaagcca gccccgacac 3180 ccgccaacac 
ccgctgacgcgccctgacgg gcttgtctgc tcccggcatc cgcttacaga 3240 caagctgtga ccgtctccgggagctgcatg tgtcagaggt tttcaccgtc atcaccgaaa 3300 cgcgcga 3307

That which is claimed is:
 1. A synthetic nucleotide sequence encoding the LexA DNA binding domain, comprising at least one codon optimized for usage by an Arabidopsis species.
 2. A synthetic nucleotide sequence according to claim 1, wherein over about 50% of the codons of the sequence are optimized for usage by an Arabidopsis species.
 3. A synthetic nucleotide sequence according to claim 1, wherein over about 80% of the codons of the sequence are optimized for usage by an Arabidopsis species.
 4. A synthetic nucleotide sequence according to claim 1, wherein the nucleotide sequence has the sequence SEQ ID NO: 1.
 5. A synthetic nucleotide sequence according to claim 1, wherein the synthetic nucleotide sequence is optimized for usage by Arabidopsis thaliana.
 6. A synthetic LexA DNA binding domain protein codon optimized for usage by an Arabidopsis species.
 7. A synthetic LexA DNA binding domain protein according to claim 6, wherein the synthetic LexA DNA binding domain protein is codon optimized for usage by Arabidopsis thaliana.
 8. A synthetic LexA DNA binding domain protein according to claim 6 that has the amino acid sequence SEQ ID NO: 2.
 9. A DNA construct comprising the synthetic nucleic acid sequence of claim 1.
 10. A DNA construct according to claim 9, wherein the synthetic nucleic acid sequence has the sequence of SEQ ID NO: 1.
 11. A DNA construct according to claim 9, further comprising a heterologous nucleic acid sequence.
 12. An eukaryotic cell comprising the DNA construct of claim 10.
 13. An eukaryotic cell comprising the DNA construct of claim 11.
 14. An eukaryotic cell according to claim 12, wherein the eukaryotic cell is a plant cell.
 15. An eukaryotic cell according to claim 14, wherein the plant cell is a dicot cell.
 16. An eukaryotic cell according to claim 15, wherein the dicot cell is an Arabidopsis thaliana cell.
 17. A transgenic plant comprising a cell of claim 13.
 18. A transgenic plant according to claim 17, wherein the plant is a dicot.
 19. A transgenic plant according to claim 18, wherein the plant is an Arabidopsis thaliana plant.
 20. A transgenic seed produced by the plant of claim 17.